This PR implements Part 1 of #17797 for the table model SQL analyzer.
It allows explicit SELECT aliases to be referenced in GROUP BY and ORDER BY.
For example:
SELECT date_bin(1h, time) AS hour_time, AVG(s1) AS avg_s1
FROM table1
GROUP BY hour_time
ORDER BY hour_time;
The alias is resolved during analysis, so existing semantic checks still apply after alias resolution.
Alias precedence rules
This PR documents and implements the name resolution rules discussed in #17797:
GROUP BY prefers current-query input columns over SELECT aliases. If an unqualified name does not resolve to a local input column, it may resolve to a matching SELECT alias.
ORDER BY prefers SELECT output aliases over input columns. If no SELECT alias matches, it falls back to the existing ORDER BY name resolution behavior.
ORDER BY alias resolution also applies to SELECT DISTINCT, for example SELECT DISTINCT s1 AS x FROM table1 ORDER BY x.
Duplicate matching SELECT aliases are rejected with an ambiguity error.
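The precedence rules above can be sketched as a small standalone model. This is not the analyzer code from the PR: the class, enum, and method names are hypothetical, names are compared by exact string match, and the "existing ORDER BY name resolution behavior" is simplified to a plain input-column lookup. It also assumes that duplicate aliases only raise an error when the alias is actually the candidate being consulted.

```java
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the precedence rules; not IoTDB's analyzer.
public class AliasPrecedenceSketch {
    enum Target { INPUT_COLUMN, SELECT_ALIAS, UNRESOLVED }

    // GROUP BY: a local input column wins; a SELECT alias is only a fallback.
    static Target resolveGroupBy(String name, Set<String> inputColumns, List<String> selectAliases) {
        if (inputColumns.contains(name)) {
            return Target.INPUT_COLUMN;
        }
        return matchesAlias(name, selectAliases) ? Target.SELECT_ALIAS : Target.UNRESOLVED;
    }

    // ORDER BY: a SELECT alias wins; otherwise fall back to the input columns
    // (a stand-in for the pre-existing ORDER BY resolution).
    static Target resolveOrderBy(String name, Set<String> inputColumns, List<String> selectAliases) {
        if (matchesAlias(name, selectAliases)) {
            return Target.SELECT_ALIAS;
        }
        return inputColumns.contains(name) ? Target.INPUT_COLUMN : Target.UNRESOLVED;
    }

    // True if exactly one alias matches; more than one match is an ambiguity error.
    static boolean matchesAlias(String name, List<String> selectAliases) {
        long matches = selectAliases.stream().filter(name::equals).count();
        if (matches > 1) {
            throw new IllegalStateException("Alias '" + name + "' is ambiguous");
        }
        return matches == 1;
    }
}
```

With an input column `x` and a SELECT alias `x`, `resolveGroupBy` picks the input column while `resolveOrderBy` picks the alias, which is the asymmetry the rules describe.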
I found two correctness issues in the alias resolution path:
GROUP BY input-column precedence currently treats outer-scope columns as input columns. resolvesToInputColumn() calls scope.tryResolveField(...).isPresent(), but that can resolve through the query boundary into a correlated outer scope. As a result, a correlated subquery such as:
SELECT x,
(SELECT COUNT(*) FROM table1 GROUP BY x)
FROM table_with_x
or an inner query with SELECT expr AS x ... GROUP BY x can have GROUP BY x blocked from resolving to the inner SELECT alias just because the outer query has a column named x. The stated rule is that GROUP BY prefers input columns, which should mean columns of the current query's source scope, not outer-query columns. Before suppressing alias resolution, the check should confirm that the resolved field is local to the current source scope, e.g. via ResolvedField.isLocal() or the relation id.
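To illustrate the difference between "resolves anywhere" and "resolves locally", here is a minimal standalone sketch. The `Scope` and `Resolved` classes below are hypothetical stand-ins, not the analyzer's real `Scope` / `ResolvedField` types; they only model a scope chain in which lookup walks outward through enclosing queries.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical model of scope lookup; not the real analyzer classes.
public class LocalScopeSketch {
    static final class Resolved {
        final String name;
        final boolean local; // true if found in the scope where the lookup started
        Resolved(String name, boolean local) {
            this.name = name;
            this.local = local;
        }
    }

    static final class Scope {
        final Set<String> columns;
        final Scope outer; // enclosing query scope, or null
        Scope(Set<String> columns, Scope outer) {
            this.columns = columns;
            this.outer = outer;
        }

        // Walks from this scope outward, like a lookup that crosses the query boundary.
        Optional<Resolved> tryResolve(String name) {
            for (Scope s = this; s != null; s = s.outer) {
                if (s.columns.contains(name)) {
                    return Optional.of(new Resolved(name, s == this));
                }
            }
            return Optional.empty();
        }
    }

    // Behavior described in the review: any match, even in an outer scope, suppresses the alias.
    static boolean suppressesAliasAnyScope(Scope scope, String name) {
        return scope.tryResolve(name).isPresent();
    }

    // Suggested behavior: only a match in the current source scope suppresses the alias.
    static boolean suppressesAliasLocalOnly(Scope scope, String name) {
        return scope.tryResolve(name).map(r -> r.local).orElse(false);
    }
}
```

For an inner scope over `table1` (columns `s1`) nested in an outer scope that has a column `x`, the first check suppresses alias resolution for `x` while the second does not.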
ORDER BY aliases that point to window functions are re-analyzed as window functions in the ORDER BY phase. For example:
SELECT row_number() OVER (ORDER BY s1) AS rn
FROM table1
ORDER BY rn
ORDER BY rn is rewritten to the row_number() OVER (...) expression, so the function is collected both in analysis.getWindowFunctions(node) and analysis.getOrderByWindowFunctions(orderBy). QueryPlanner then plans SELECT window functions first and ORDER BY window functions again after switching to the ORDER BY scope, producing duplicate window planning instead of ordering by the SELECT output alias. For ORDER BY alias references, the planner should reuse the SELECT output symbol / field reference rather than treating the alias target as a fresh ORDER BY window expression.
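The suggested fix can be pictured with a toy planner. Everything here is hypothetical: expressions are plain strings, a window function is detected by the substring " OVER ", and output symbols are just the alias prefixed with "$". The point is only that an ORDER BY key naming a SELECT alias maps to the already-planned output symbol instead of planning the aliased expression a second time.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy planner; not QueryPlanner from the code base.
public class OrderByAliasPlanSketch {
    // Plans each SELECT item once; window expressions are recorded in plannedWindows.
    // Returns a map from alias to the output symbol produced for it.
    static Map<String, String> planSelect(Map<String, String> selectItems, List<String> plannedWindows) {
        Map<String, String> aliasToSymbol = new LinkedHashMap<>();
        for (Map.Entry<String, String> item : selectItems.entrySet()) {
            if (item.getValue().contains(" OVER ")) {
                plannedWindows.add(item.getValue());
            }
            aliasToSymbol.put(item.getKey(), "$" + item.getKey());
        }
        return aliasToSymbol;
    }

    // Plans one ORDER BY key. An alias reference reuses the SELECT output symbol;
    // only a genuinely new window expression is planned here.
    static String planOrderByKey(String key, Map<String, String> aliasToSymbol, List<String> plannedWindows) {
        String symbol = aliasToSymbol.get(key);
        if (symbol != null) {
            return symbol;
        }
        if (key.contains(" OVER ")) {
            plannedWindows.add(key);
        }
        return key;
    }
}
```

In this model, `ORDER BY rn` over `row_number() OVER (ORDER BY s1) AS rn` yields the symbol `$rn` and leaves exactly one planned window function.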
Scope
This PR only handles Part 1 of #17797:
GROUP BY
ORDER BY
The following items are intentionally left out of scope for a follow-up PR:
WHERE
HAVING
Refs #17797
Key changed/added classes (or packages if there are too many classes) in this PR
StatementAnalyzer
SelectAliasReuseTest
TestMetadata
Test
./mvnw test -pl iotdb-core/datanode -am -Dtest=SelectAliasReuseTest -DfailIfNoTests=false -Dsurefire.failIfNoSpecifiedTests=false -DskipITs
Result: